Clustering Chinese Product Features with Multilevel Similarity
Abstract
This paper presents an unsupervised hierarchical clustering approach for grouping co-referred features in Chinese product reviews. To capture connections between co-referred product features at different levels, we consider three similarity measures: literal similarity, word-embedding-based semantic similarity, and explanatory-evaluation-based contextual similarity. We apply our approach to two corpora of product reviews, in the car and mobile phone domains. The results demonstrate that combining multilevel similarity is of great value to feature normalization.
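The combination described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the paper's method: character-level Jaccard overlap stands in for literal similarity, cosine similarity over toy vectors stands in for both the embedding-based and the contextual measures, the three scores are mixed with equal weights, and clusters are merged by average linkage until the best score falls below a hypothetical `threshold`. The names `literal_sim`, `combined_sim`, and `cluster`, and all vectors, are invented for this example.

```python
from itertools import combinations
from math import sqrt


def literal_sim(a, b):
    """Character-level Jaccard overlap (assumed stand-in for literal similarity)."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = sqrt(sum(x * x for x in u))
    nv = sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def combined_sim(a, b, emb, ctx, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of literal, semantic, and contextual similarity (weights are assumed)."""
    w_lit, w_sem, w_ctx = weights
    return (w_lit * literal_sim(a, b)
            + w_sem * cosine(emb[a], emb[b])
            + w_ctx * cosine(ctx[a], ctx[b]))


def cluster(features, emb, ctx, threshold=0.5):
    """Agglomerative clustering with average linkage over the combined similarity."""
    clusters = [[f] for f in features]

    def link(c1, c2):
        # Average pairwise similarity between the members of two clusters.
        scores = [combined_sim(a, b, emb, ctx) for a in c1 for b in c2]
        return sum(scores) / len(scores)

    while len(clusters) > 1:
        # Pick the most similar pair of clusters.
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        if link(clusters[i], clusters[j]) < threshold:
            break  # no remaining pair is similar enough to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy data: two price-related and two screen-related feature words.
features = ["价格", "价钱", "屏幕", "显示屏"]
emb = {"价格": [1.0, 0.0], "价钱": [0.9, 0.1],
       "屏幕": [0.0, 1.0], "显示屏": [0.1, 0.9]}
ctx = {"价格": [1.0, 0.0], "价钱": [1.0, 0.0],
       "屏幕": [0.0, 1.0], "显示屏": [0.0, 1.0]}
groups = cluster(features, emb, ctx)
```

In this toy setup the literal score alone is weak (each related pair shares only one character), so the semantic and contextual terms are what push the related pairs over the threshold. This is one way to see why mixing several levels of similarity can help.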
Similar resources
Extract Product Features in Chinese Web for Opinion Mining
In sentiment analysis of product reviews, one important problem is to extract people's opinions based on product features. Through the summary of feature-level opinions, different consumers can choose their favorite products according to the features that they care about. At the same time, manufacturers can also improve the product features based on the opinions. Different words may be used to ...
Fast Multilevel Clustering
Clustering is a difficult problem. Clustering data may differ by a variety of aspects (dimensionality, cluster size, noise, etc), and the criterion for clustering may depend on the context in which the data is given. We present a multilevel approach for clustering, easily adaptable to handle various kinds of data by identifying desired underlying features of the data. The scheme we present is g...
Suffix Tree Based Chinese Document Feature Extraction and Clustering in RSS Aggregator
In an RSS aggregator, an important issue is how to make feed information more manageable for RSS subscribers. In this paper, we propose suffix tree based clustering of RSS feed documents in a Chinese RSS aggregator. We construct a suffix tree with meaningful Chinese words, and choose the phrases with high scores given by a formula as document features. We cluster documents using group-average algo...
Offline Recognition of Chinese Handwriting by Multifeature and Multilevel Classification
One of the most challenging topics is the recognition of Chinese handwriting, especially offline recognition. In this paper, an offline recognition system based on multifeature and multilevel classification is presented for handwritten Chinese characters. Ten classes of multifeatures, such as peripheral shape features, stroke density features, and stroke direction features, are used in this sys...
Polygonal Clustering Analysis Using Multilevel Graph-Partition
Existing methods of spatial data clustering have focused on point data, whose similarity can be easily defined. Due to the complex shapes and alignments of polygons, the similarity between non-overlapping polygons is important to cluster polygons. This study attempts to present an efficient method to discover clustering patterns of polygons by incorporating spatial cognition principles and mult...